Abstract


Since the start of the COVID-19 pandemic, there have been widespread notions
that men contract the disease at a higher rate than women and are more likely
to die from it. Furthermore, age has been reported as a risk factor in COVID-19
mortality, and a vast majority of COVID-19 deaths have been among elderly
people. Therefore, it is of interest to conduct a statistical analysis of
COVID-19 deaths to determine whether men are more likely to die of COVID-19
than women, how age group influences the number of reported COVID-19 deaths,
and whether these two factors interact. This paper uses a two-way analysis of
variance (two-way ANOVA) as a statistical tool to analyze COVID-19 deaths in
the US. Two-way ANOVA can effectively determine whether age and gender are
significant factors in COVID-19 deaths in the US. The dependent variable in the
analysis is the number of COVID deaths in the entire US, and the two
independent variables are age group and gender (sex). The age groups consist of
11 subgroups or levels ranging from babies to elderly people. The sex is either
male or female. Results showed that age group is a significant factor in COVID
deaths, while gender was found to be an insignificant factor in COVID
mortality.


1. Introduction


The COVID-19 pandemic has impacted all aspects of our daily life, including
education, culture, employment, communication, sports, shopping, and other
areas. The pandemic has changed how we work, learn, and interact, as social
distancing guidelines have led to a more virtual existence, both personally and
professionally.

There is a widespread notion that men are more vulnerable to COVID-19 than
women, and that more men than women have died of COVID-19. Furthermore, age is
reported as a significant risk factor in COVID-19 mortality, and a vast
majority of COVID-19 deaths have been among people older than 70. Therefore,
the role of gender in COVID-19 mortality has to be investigated along with the
age groups. It is vital to consider gender as a biological variable in the
prevention and care of the disease. Understanding the effects of the age
groups, of gender, and of the interaction between them is just as important.
Differences in sex are biological. These include differences in reproductive
organs and their functions, sex hormones, and the gene expression of
chromosomes [1] [2] [3] [4]. Gender is the performance of socially constructed
roles, behaviors, and attributes considered socially acceptable for men and
women. In addition to sex differences in immune responses, hormones, and genes,
there are also psychological, social, and behavioral components that influence
COVID-19 progression. Compared with women, men tend to engage in more high-risk
behaviors that increase the potential for contracting COVID-19. A range of
biological, psychological, and behavioral factors might explain why men have
higher rates of COVID-19-associated morbidity and mortality than women [5] [6].
Therefore, we recognize that gender and age groups are important and deserve
attention in the analysis of reported COVID-19 deaths. Knowing the impact of
age groups and gender on coronavirus deaths would help in taking precautions in
advance for the people who are more vulnerable to the disease, such as
prioritizing vaccinations and health care services for those groups.


2. Data 


Data are obtained from the “Provisional COVID-19 Deaths by Sex and Age”
dataset, publicly available at the data.gov and cdc.gov websites. The data
contain the number of deaths from COVID-19 and from other diseases, such as
pneumonia and influenza, reported to the National Center for Health Statistics
(NCHS) by sex, age group, and jurisdiction of occurrence. The data we used
contain COVID-19 deaths recorded from 1/1/2020 to 11/09/2022. COVID death
records are available for the entire US and for each individual state. This
paper uses the recorded COVID deaths for the entire US in the analysis. The
data were carefully inspected and cleaned. Missing values and outliers were
removed. Only COVID-19 deaths were used in the analysis, together with both
independent variables: age group and sex. The age groups have 11 subgroups
(levels) as shown in Figure 1. These subgroups are: (0 - 1), (1 - 4), (5 - 14),
(15 - 24), (25 - 34), (35 - 44), (45 - 54), (55 - 64), (65 - 74), (75 - 84),
and (85 years and above). The sex variable includes males and females. The data
used in the analysis are presented in Table 1.


3. Exploratory Data Analysis


First, summary statistics for the variables in the analysis are computed, as
shown in Table 2.

Table 2 shows that the mean of “COVID deaths” in the dataset is 48,524.5, the
median is 14,490, and the standard deviation is 58,978.6. The mean of the “age
group” is 40.86, the median is 40.0, and the standard deviation is 30.28. The
mean of the variable “sex” is 1.5, the median is 1.5, and the standard
deviation is 0.52.
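
The “COVID deaths” row of Table 2 can be cross-checked directly from the counts
in Table 1. The following is a minimal sketch rather than the R code used for
the paper; the variable names are illustrative, and the `inclusive` quantile
method is an assumption chosen because it matches the quartiles reported in
Table 2.

```python
from statistics import mean, median, quantiles, stdev

# COVID-19 deaths by age group, copied from Table 1 (youngest to oldest).
male = [211, 106, 213, 1688, 7241, 17966, 44132, 94375, 143909, 155128, 123830]
female = [169, 96, 210, 1154, 4578, 11014, 25010, 59025, 98568, 121525, 157391]
deaths = male + female  # 22 observations: 11 age groups x 2 sexes

# ASSUMPTION: linear interpolation between order statistics ("inclusive").
q1, q2, q3 = quantiles(deaths, n=4, method="inclusive")

summary = {
    "mean": mean(deaths),
    "median": median(deaths),
    "st_dev": stdev(deaths),  # sample standard deviation (n - 1 denominator)
    "min": min(deaths),
    "max": max(deaths),
    "q1": q1,
    "q3": q3,
}

for name, value in summary.items():
    print(f"{name:>7}: {value:,.1f}")
```

The same lists can be summarized separately by sex, which gives the male and
female means and medians quoted in the discussion of Figure 10.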

Figure 1. Levels of age groups in COVID-19 data.


Table 1. Data used in the analysis.

Sex    Age group (years)   COVID deaths    Sex      Age group (years)   COVID deaths
Male   0 - 1               211             Female   0 - 1               169
Male   1 - 4               106             Female   1 - 4               96
Male   5 - 14              213             Female   5 - 14              210
Male   15 - 24             1688            Female   15 - 24             1154
Male   25 - 34             7241            Female   25 - 34             4578
Male   35 - 44             17,966          Female   35 - 44             11,014
Male   45 - 54             44,132          Female   45 - 54             25,010
Male   55 - 64             94,375          Female   55 - 64             59,025
Male   65 - 74             143,909         Female   65 - 74             98,568
Male   75 - 84             155,128         Female   75 - 84             121,525
Male   >84                 123,830         Female   >84                 157,391


Table 2. Summary statistics of the variables included in the analysis.

Variable       Mean       Median   St.Dev.    Min   Max       1st Qu.   3rd Qu.
Age Group      40.86      40.0     30.28      0.5   95        16.5      71.5
Sex            1.5        1.5      0.52       1     2         1         2
COVID Deaths   48,524.5   14,490   58,978.6   96    157,391   448.2     97,519.8


Exploratory data analysis is conducted to discover hidden patterns in the data
and the relationships between the variables. The normal distribution curve of
the response variable (number of COVID deaths in the US) is generated using the
R software, as shown in Figure 2. Figure 2 shows that the peak of the curve
lies at approximately 50,000 COVID deaths and that the distribution is skewed
to the right.

Figure 3. Number of COVID deaths by age groups for males.

Figure 3 shows a bar chart presenting the number of COVID deaths by age groups
for males, and Figure 4 shows a bar chart presenting the number of COVID deaths
by age groups for females. Figure 5 and Figure 6 show pie charts of the
percentages of COVID deaths by age groups for males and females, respectively.
Both pie charts show that babies and children account for very small
percentages of COVID deaths compared to elderly people. For example, the
percentages of COVID deaths for age groups (0 - 1, 1 - 4, 5 - 14) are (0.04%,
0.02%, 0.04%) for both males and females, while the percentages of COVID deaths
for age groups (65 - 74, 75 - 84) are (24.4%, 26.3%) for males and (20.6%,
25.4%) for females.
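
The percentages quoted above follow from dividing each age group's count in
Table 1 by the total for that sex. A short sketch of that arithmetic is given
below; the helper name `shares` and the dictionary layout are illustrative
choices, not part of the original analysis.

```python
AGE_GROUPS = ["0-1", "1-4", "5-14", "15-24", "25-34", "35-44",
              "45-54", "55-64", "65-74", "75-84", ">84"]

# Counts copied from Table 1, in the same order as AGE_GROUPS.
MALE = [211, 106, 213, 1688, 7241, 17966, 44132, 94375, 143909, 155128, 123830]
FEMALE = [169, 96, 210, 1154, 4578, 11014, 25010, 59025, 98568, 121525, 157391]


def shares(counts):
    """Return each age group's percentage of the column total."""
    total = sum(counts)
    return [100 * count / total for count in counts]


male_pct = dict(zip(AGE_GROUPS, shares(MALE)))
female_pct = dict(zip(AGE_GROUPS, shares(FEMALE)))

for group in AGE_GROUPS:
    print(f"{group:>6}  male {male_pct[group]:6.2f}%  female {female_pct[group]:6.2f}%")
```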

Figure 4. Number of COVID deaths by age groups for females.

Figure 5. Percentages of COVID deaths by age groups for males.
Legend values (age group: share): 0-1: 0.04%; 1-4: 0.02%; 5-14: 0.04%;
15-24: 0.29%; 25-34: 1.23%; 35-44: 3.05%; 45-54: 7.50%; 55-64: 16.03%;
65-74: 24.40%; 75-84: 26.30%; >84: 21.00%.

Figure 6. Percentages of COVID deaths by age groups for females.

Figure 7 shows a line plot of the COVID-19 deaths by age group for males and
females together in the US. The number of COVID deaths is relatively low for
groups (0 - 1, 1 - 4, 5 - 14, 15 - 24). However, deaths start to increase from
group (25 - 34) until the last age group (85 years and above). The increase in
COVID deaths is most apparent for the elderly groups (55 - 64, 65 - 74, 75 -
84, 85 and above). This indicates that the age group has an impact on the
number of deaths: as the age group increases, the number of COVID deaths
increases as well.

Figure 7. Line plot of COVID-19 deaths by age groups for both males and females
together.

Figure 8 shows the COVID deaths for males and females by age group separately.
The number of deaths is almost identical for males and females in the first
three age groups (0 - 1, 1 - 4, 5 - 14). The number of COVID deaths then
increases for both males and females until the last group. The number of COVID
deaths for males is higher than for females in all age groups except the last
group (85 and above), where the number of deaths for females is clearly larger
than for males. This suggests that males experienced more fatalities from COVID
than females. However, we have to determine whether this difference is
statistically significant by employing the two-way ANOVA procedure in the
analysis.

Figure 8. COVID-19 deaths for males and females by age groups separately.

Boxplots of COVID-19 deaths by age groups for both males and females together
are shown in Figure 9. A boxplot is a useful visualization tool that shows the
minimum, 1st quartile, median, 3rd quartile, and maximum values. Figure 9 shows
that the number of COVID deaths increases starting from group 5 (25 - 34),
which has a median value of 5909.5. The median for group 6 (35 - 44) is 14,490,
and the median for group 7 (45 - 54) is 34,571. The median for group 8 (55 -
64) is 76,700. The median for group 9 (65 - 74) is 121,238.5. The median for
group 10 (75 - 84) is 138,326.5. The median for group 11 (85 years and above)
is 140,610.5. As the age group increases, the median number of COVID deaths
increases as well.

Figure 10 shows the COVID-19 deaths by gender (sex) in the US. The median
number of COVID deaths for males is 17,966, whereas the median for females is
smaller, at 11,014. The mean (average) of COVID deaths for males is 53,527.18,
while the mean for females is only 43,521.82. The third quartile (Q3) for males
is 123,830, while for females it is only 98,568. The interquartile range (IQR =
Q3 - Q1) for males is 123,617, while for females the IQR is only 98,358. Again,
this suggests that males experienced more fatalities from COVID-19 than
females. However, we have to determine whether this difference is statistically
significant through the ANOVA method.

Figure 9. Boxplots of COVID-19 deaths by age groups in the US.

Figure 10. Boxplots of COVID-19 deaths by gender types in the US.


4. Overview on the Analysis of Variance (ANOVA) 


ANOVA, which stands for “Analysis of Variance”, is a statistical test used to
analyze the differences between the means of more than two groups. A one-way
ANOVA uses one independent variable, while a two-way ANOVA uses two independent
variables. MANOVA (multivariate analysis of variance) extends ANOVA to multiple
dependent variables. ANOVA can determine if the dependent variable changes
according to the level of the independent variable. The statistic generated by
ANOVA is called the F-statistic, which is the ratio of between-treatments
variance to within-treatment variance [7]-[12]. Within-treatment variance
refers to the variability within a particular sample. There are two sources
that contribute to the variability within a sample: individual differences and
experimental error. Both of these sources of variance are considered to be
random error variance because they are unintentional and not the result of
planning or design. Between-treatments variance refers to the variability
between the treatment groups. The same two factors that contribute to
within-treatment variance (individual differences and experimental error) also
contribute to between-treatments variance. However, there is an additional
source that contributes to between-treatments variance: treatment effects. If
the independent variable had no influence on the dependent variable, the value
of the F-statistic would be approximately 1. Conversely, if there were
treatment effects that created large differences between group means, then the
treatment variance would be bigger than the within-treatment variance and the
value of F would be larger than 1. The within-treatment variance is also
referred to as error variance, and the between-treatments variance is referred
to as treatment variance [13] [14] [15] [16]. Thus, another way of stating the
F-statistic is:


F = Treatment variance/Error variance. 


A typical ANOVA table is usually presented with the terms shown in Table 3,
where:

Between Groups Degrees of Freedom: $DF = k - 1$, where $k$ is the number of
groups;

Within Groups Degrees of Freedom: $DF = N - k$, where $N$ is the total number
of subjects;

Total Degrees of Freedom: $DF = N - 1$;

Sum of Squares Between Groups: $SS_B = \sum_{i=1}^{k} n_i (\bar{x}_i - \bar{x})^2$,
where $n_i$ is the number of subjects in the $i$-th group, $\bar{x}_i$ is the
mean of the $i$-th group, and $\bar{x}$ is the overall mean;

Sum of Squares Within Groups: $SS_W = \sum_{i=1}^{k} (n_i - 1) S_i^2$, where
$S_i$ is the standard deviation of the $i$-th group;

Total Sum of Squares: $SS_T = SS_B + SS_W$;

Mean Square Between Groups: $MS_B = SS_B/(k - 1)$;

Mean Square Within Groups: $MS_W = SS_W/(N - k)$;

F-Statistic (or F-ratio): $F = MS_B/MS_W$.
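
The quantities defined above can be traced on a small example. The sketch below
is illustrative only: the function name `one_way_anova` and the three toy
groups are hypothetical and unrelated to the COVID-19 data. It follows the
formulas for the sums of squares, the mean squares, and the F-ratio as stated.

```python
from statistics import mean, variance


def one_way_anova(groups):
    """Return the sums of squares, mean squares, and F-ratio for the groups."""
    k = len(groups)                          # number of groups
    n_total = sum(len(g) for g in groups)    # total number of subjects, N
    grand_mean = mean(x for g in groups for x in g)

    # SS_B: group sizes times squared distance of group means from the grand mean.
    ss_b = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # SS_W: (n_i - 1) times the sample variance S_i^2 of each group.
    ss_w = sum((len(g) - 1) * variance(g) for g in groups)

    ms_b = ss_b / (k - 1)
    ms_w = ss_w / (n_total - k)
    return {"SS_B": ss_b, "SS_W": ss_w, "MS_B": ms_b, "MS_W": ms_w,
            "F": ms_b / ms_w}


# Hypothetical toy data: three groups of three observations each.
toy_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
anova = one_way_anova(toy_groups)
print({name: round(value, 4) for name, value in anova.items()})
```

Because the third toy group sits well above the other two while the spread
inside each group is small, the between-groups mean square dominates and F is
much larger than 1.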

Table 3. Terms of a typical ANOVA table.

Source           Degrees of Freedom (DF)   Sum of Squares (SS)   Mean of Squares (MS)   F-statistic     p-value
Between Groups   k - 1                     SS_B                  MS_B = SS_B/(k - 1)    F = MS_B/MS_W
Within Groups    N - k                     SS_W                  MS_W = SS_W/(N - k)
Total            N - 1                     SS_T = SS_B + SS_W

Additionally, conducting an ANOVA involves hypothesis testing. The null
hypothesis predicts that the value of F will be close to 1.00 because it
assumes no difference between means, and that any differences found between
means are simply due to chance, or random error. The alternative hypothesis
assumes that there are significant differences between some of the means.
F-statistics cannot be negative because they are calculated from ratios of
variances, which are based on squared scores. The ANOVA alone does not tell us
specifically which means are different from one another. To determine that, we
need to follow up with a multiple comparisons (or post-hoc) test. ANOVA makes
the following assumptions [7] [8] [11] [14] [15]:

1) Normality: Each sample is drawn from a normally distributed population.

2) Equal Variances: The variances of the populations that the samples come
from are equal.

3) Independence: The observations in each group are independent of each other
and the observations within groups are obtained by a random sample.


5. Methodology Steps and Discussion 


We have conducted the following steps in the analysis: 

Step 1: Hypothesis testing 

We will test three hypotheses in this research as follows:

1st null hypothesis $H_{01}$: There is no difference in COVID deaths for any
age group.

1st alternative hypothesis: There is a difference in COVID deaths by age group.

2nd null hypothesis $H_{02}$: There is no difference in COVID deaths at either
sex level (males or females).

2nd alternative hypothesis: There is a difference in COVID deaths by sex level.

3rd null hypothesis $H_{03}$: The effect of one independent variable on COVID
deaths does not depend on the effect of the other independent variable (i.e.,
no interaction effect).

3rd alternative hypothesis: There is an interaction effect between age group
and sex on COVID deaths.

Step 2: Fitting a two-way ANOVA with interaction term 

We run a two-way ANOVA using the R software with an interaction term between
the age groups and sex. The outcome is shown in Table 4.


Table 4. Outcome of two-way ANOVA with interaction term.

Source                          Sum of Squares (SS)   Degrees of Freedom (DF)   Mean Squares (MS)   F-statistic   p-value
age groups                      70,060,000,000        10                        70,060,000,00       25.76         3.32e-06***
sex                             4,741,000             1                         4,741,000           0.166         0.689
age group * sex (interaction)   3,058,000             1                         3,058,000           0.107         0.748
residuals                       515,300,000           18                        28,630,000

*** significant at 0.001.


• Table 4 shows that the p-value for the independent variable “age group” is
very small and significant at the 0.001 level. So, the age group is
statistically significant.

• The p-value for the independent variable “sex” is 0.689 > 0.05. So, the sex
variable is statistically insignificant at the significance level of 0.05.

• The p-value for the interaction term between age group and sex is 0.748 >
0.05. So, the interaction term is statistically insignificant at the alpha
level of 0.05. These results indicate that the “age group” is the only risk
factor that has a statistically significant effect on COVID deaths. The sex and
interaction term have no significant effects on COVID deaths.

• Based on these results, we reject the 1st null hypothesis in favor of the 1st
alternative hypothesis and conclude that there is a difference in COVID deaths
by age groups.

• Also, we fail to reject the 2nd and 3rd null hypotheses. We therefore find no
statistically significant difference in COVID deaths between the sex levels
(males or females), and no significant effect of the interaction between age
group and sex on COVID deaths.
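
To make the rows of such a table concrete, the decomposition behind a two-way
ANOVA with interaction can be sketched on a small balanced design. This is not
the R call used for Table 4 and it does not reproduce that table: the function
`two_way_anova`, the factor labels, and the toy cells (two replicates per cell,
assumed so that an error term exists) are all hypothetical.

```python
from statistics import mean


def two_way_anova(cells):
    """Balanced two-way ANOVA; cells[(a, b)] is a list of replicate values."""
    a_levels = sorted({a for a, _ in cells})
    b_levels = sorted({b for _, b in cells})
    n_rep = len(next(iter(cells.values())))
    if n_rep < 2 or any(len(v) != n_rep for v in cells.values()):
        raise ValueError("need a balanced design with at least 2 replicates")

    grand = mean(x for v in cells.values() for x in v)
    a_mean = {a: mean(x for b in b_levels for x in cells[(a, b)]) for a in a_levels}
    b_mean = {b: mean(x for a in a_levels for x in cells[(a, b)]) for b in b_levels}
    cell_mean = {key: mean(values) for key, values in cells.items()}

    na, nb = len(a_levels), len(b_levels)
    ss = {
        "A": nb * n_rep * sum((a_mean[a] - grand) ** 2 for a in a_levels),
        "B": na * n_rep * sum((b_mean[b] - grand) ** 2 for b in b_levels),
        "AB": n_rep * sum((cell_mean[(a, b)] - a_mean[a] - b_mean[b] + grand) ** 2
                          for a in a_levels for b in b_levels),
        "E": sum((x - cell_mean[key]) ** 2 for key, v in cells.items() for x in v),
    }
    df = {"A": na - 1, "B": nb - 1, "AB": (na - 1) * (nb - 1),
          "E": na * nb * (n_rep - 1)}
    ms_error = ss["E"] / df["E"]

    table = {"E": {"SS": ss["E"], "DF": df["E"], "MS": ms_error}}
    for effect in ("A", "B", "AB"):
        ms = ss[effect] / df[effect]
        table[effect] = {"SS": ss[effect], "DF": df[effect], "MS": ms,
                         "F": ms / ms_error}
    return table


# Hypothetical toy cells: factor A = age band, factor B = sex.
toy_cells = {
    ("young", "M"): [1, 3], ("young", "F"): [2, 4],
    ("old", "M"): [7, 9], ("old", "F"): [6, 8],
}
table = two_way_anova(toy_cells)
for effect in ("A", "B", "AB"):
    print(effect, table[effect])
```

In the toy cells the age band drives almost all of the variation, so the F-ratio
for factor A is large while the ratios for factor B and the interaction stay
near or below 1, which mirrors the qualitative pattern described for Table 4.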


Step 3: Checking the assumptions of the two-way ANOVA 
Two-way ANOVA makes some assumptions that should be checked in order to be sure
that the procedure is well suited to our data. We can check these assumptions
as follows:

1) Homogeneity of variance (or homoscedasticity): We can use the residuals
versus fitted plot to check the homogeneity of variances. The plot is generated
in R as shown in Figure 11. The red line of the residuals vs fitted values is
almost straight, so there is no evident relationship between residuals and
fitted values. This indicates that the variation around the mean is similar
among all groups being compared. So, we can assume that the homogeneity of
variances has been met. To further support this assumption, we conducted
Levene’s test for homogeneity of variances in R, and the results are shown in
Table 5.

Table 5 shows a p-value of 0.148 > 0.05, which means that there is no evidence
to suggest that the variance across groups is significantly different.
Therefore, we can assume that the homogeneity of variances in the different age
groups has been met.

Figure 11. Plot of residuals vs fitted, normal Q-Q plot, and residuals vs
leverages.

2) Normality: The normal quantile-quantile (Q-Q) plot in Figure 11
approximately follows an inclined straight line, and all the points fall
approximately along this reference line, so we can assume that the residuals
are normally distributed. To support this assumption, we conducted the
Shapiro-Wilk normality test on the ANOVA residuals in R, and the test results
are shown in Table 6.

Table 6 shows that the p-value of the test is 0.6927 > 0.05, so there is no
evidence against the normality of the residuals, and the normality assumption
can be considered met.

3) Independence: We also assume that the observations in each group are
independent of each other and that the observations within groups were obtained
by a random sample, so that the ANOVA assumption of independence is met as
well.
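
For the homogeneity check in item 1), the statistic behind a median-centered
Levene test can be sketched as follows. This is a simplified illustration, not
the R routine used for Table 5: the function name `levene_median` and the toy
groups are hypothetical, and the p-value (obtained by comparing W with an F
distribution on k - 1 and N - k degrees of freedom) is not computed here.

```python
from statistics import mean, median


def levene_median(groups):
    """Levene's W statistic with median centering (Brown-Forsythe variant)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)

    # Absolute deviation of every observation from its own group median.
    z = [[abs(x - median(g)) for x in g] for g in groups]
    z_bar = [mean(zi) for zi in z]               # mean deviation per group
    z_grand = mean(v for zi in z for v in zi)    # mean deviation overall

    between = sum(len(zi) * (zb - z_grand) ** 2 for zi, zb in zip(z, z_bar))
    within = sum((v - zb) ** 2 for zi, zb in zip(z, z_bar) for v in zi)
    return (n_total - k) / (k - 1) * between / within


# Hypothetical toy data: three groups with increasingly wide spread.
toy_groups = [[1, 2, 3], [2, 4, 6], [0, 5, 10]]
w_stat = levene_median(toy_groups)
print(f"W = {w_stat:.4f}")
```

The statistic is a one-way ANOVA F-ratio applied to the absolute deviations, so
a large W signals that the spread differs between groups.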

Table 5. The outcome of Levene’s test for homogeneity of variances.

Levene’s Test for Homogeneity of Variance (center = median)

        DF   F-value   p-value
group   21   1.7086    0.1484


Table 6. The outcome of Shapiro-Wilk normality test.

Shapiro-Wilk normality test

W        p-value
0.9692   0.6927


Step 4: Effect Size

We can use the “Partial eta squared” to measure the effect size of different
variables in two-way ANOVA models. It measures the proportion of variance
explained by a given variable of the total variance remaining after accounting
for variance explained by other variables in the model. When there is only one
predictor variable in the model (i.e., a one-way ANOVA), the values for eta
squared and partial eta squared are equal [17]-[25]. Partial eta squared is
calculated as follows:


Partial eta squared = $SS_{effect} / (SS_{effect} + SS_{error})$


where:

• $SS_{effect}$: The sum of squares of an effect for one variable.
• $SS_{error}$: The sum of squares error in the ANOVA model.

The value for Partial eta squared ranges from 0 to 1, where values closer to 1
indicate a higher proportion of variance that can be explained by a given
variable in the model after accounting for variance explained by other
variables in the model.

The following rules of thumb are used to interpret values for Partial eta 
squared [18]: 

0.01 or smaller: small effect size. 

0.06: medium effect size. 

0.14 or higher: large effect size. 

Table 7 shows the values of the partial eta squared for the variables age group
and sex.

Table 7 shows that the age group variable has a very large effect size, whereas
the sex variable has a very small effect size.
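
The partial eta squared formula can be applied to the sums of squares printed
in Table 4. The sketch below is an arithmetic check under stated assumptions:
the helper names are illustrative, and the way values between the quoted
benchmarks are labeled (anything below 0.06 as small, anything from 0.06 up to
0.14 as medium) is an assumption, since the rule of thumb only names three
reference points.

```python
# Sums of squares as printed in Table 4 (rounded values).
SS_ERROR = 515_300_000
SS_EFFECT = {
    "age group": 70_060_000_000,
    "sex": 4_741_000,
    "age group * sex": 3_058_000,
}


def partial_eta_squared(ss_effect, ss_error):
    """Share of variance explained by an effect, relative to effect + error."""
    return ss_effect / (ss_effect + ss_error)


def effect_label(eta):
    # ASSUMPTION: values between the benchmarks fall into the lower category.
    if eta >= 0.14:
        return "large"
    if eta >= 0.06:
        return "medium"
    return "small"


effect_sizes = {name: partial_eta_squared(ss, SS_ERROR)
                for name, ss in SS_EFFECT.items()}

for name, eta in effect_sizes.items():
    print(f"{name:<16} {eta:.4f}  {effect_label(eta)}")
```

With the rounded sums of squares, the results for age group and sex come out
close to the values listed in Table 7.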

Step 5: Post-hoc Test 

The ANOVA outcome can tell which variable is significant, but not which levels
of the variable are different from one another. To determine this, we can use a
post-hoc test. Tukey’s Honestly-Significant-Difference (Tukey HSD) test can
effectively identify which groups are significantly different from one another.
Tukey’s HSD focuses on the difference between the groups with the largest and
smallest means. If the difference is less than or equal to the margin of error
for the difference in the means, then the confidence interval for that
difference will contain zero. If the confidence interval contains zero, then
the group pair does not differ significantly. If the confidence interval does
not cover zero, then the group pair differs significantly. A Tukey HSD test is
conducted in R. The outcome shows the pairwise differences between the 11
levels of age groups with the average difference, the lower and upper bounds of
the 95% confidence interval, and the p-value of the difference. There are 55
pairwise comparisons. In order to graphically illustrate the Tukey HSD outcome,
we generated the Tukey HSD plot of the 95% family-wise confidence level for the
variable age groups in R, as shown in Figure 12. The significant pairwise
differences are those where the 95% confidence interval does not include zero.
This is another way of saying that the p-value for these pairwise differences
is <0.05.
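
The decision rule described above (an interval that excludes zero marks a
significant pair) can be illustrated with the group means from Table 1. This
sketch does not recompute the Tukey critical value. Instead it assumes a single
interval half-width read off Table 8 as the difference minus the lower bound
for pair 8-1; that constant and all variable names are assumptions made for
illustration.

```python
from itertools import combinations
from math import comb

# Counts copied from Table 1, ordered from age level 1 (0-1) to level 11 (>84).
MALE = [211, 106, 213, 1688, 7241, 17966, 44132, 94375, 143909, 155128, 123830]
FEMALE = [169, 96, 210, 1154, 4578, 11014, 25010, 59025, 98568, 121525, 157391]

# Mean deaths per age level, averaging the male and female counts.
group_mean = {level: (m + f) / 2
              for level, (m, f) in enumerate(zip(MALE, FEMALE), start=1)}

# ASSUMPTION: half-width of the 95% family-wise interval, taken from Table 8
# as (difference - lower bound) for pair 8-1: 76,510 - 11,149.57.
HALF_WIDTH = 76510 - 11149.57

significant = []
for low, high in combinations(sorted(group_mean), 2):
    diff = group_mean[high] - group_mean[low]
    lower, upper = diff - HALF_WIDTH, diff + HALF_WIDTH
    if lower > 0 or upper < 0:  # the interval excludes zero
        significant.append((high, low, diff))

n_pairs = comb(len(group_mean), 2)
print(f"{len(significant)} of {n_pairs} pairwise intervals exclude zero")
```

Under that assumed half-width, the pairs flagged by the rule can be compared
with the pairs listed in Table 8.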

From the outcome of the Tukey HSD test and the 95% family-wise plot, we can
determine the significant levels of the variable “age groups”, as shown in
Table 8.


Table 7. Effect size of variables in two-way ANOVA.

Variable    Partial eta squared   Interpretation
age group   0.9926                Large effect size
sex         0.0091                Small effect size


Figure 12. The 95% family-wise confidence level of pairwise differences for age
groups.


Table 8. The significant pairwise levels of the age groups.

Significant pairwise                  95% Confidence Interval   95% Confidence Interval
levels of age groups   Difference     Lower Bound               Upper Bound               p-value
8-1                    76,510         11,149.57                 141,870                   0.018
9-1                    121,048.5      55,688.07                 186,408                   0.0004
10-1                   138,136.5      72,776.07                 203,496                   0.0001
11-1                   140,420.5      75,060.08                 205,780                   0.0001
8-2                    76,599         11,238.57                 141,959                   0.0179
9-2                    121,137.5      55,777.07                 186,497                   0.0004
10-2                   138,225.5      72,865.06                 203,585                   0.0001
11-2                   140,509.5      75,149                    205,869                   0.0001
8-3                    76,488         11,128                    141,848                   0.0181
9-3                    121,027        55,666                    186,378                   0.0004
10-3                   138,115        72,754                    203,475                   0.0001
11-3                   140,399        75,038                    205,759                   0.0001
8-4                    75,279         9918                      140,639                   0.0202
9-4                    119,817        54,457                    185,177                   0.0004
10-4                   136,905        71,545                    202,265                   0.0001
11-4                   139,189        73,829                    204,549                   0.0001
8-5                    70,790         5430                      136,150                   0.0304
9-5                    115,329        49,986                    180,689                   0.0006
10-5                   132,417        67,056                    197,777                   0.0001
11-5                   134,701        69,340                    200,061                   0.0001
9-6                    106,748        41,388                    172,108                   0.0013
10-6                   123,836        58,476                    189,196                   0.0003
11-6                   126,120        60,760                    191,480                   0.0003
9-7                    86,667         21,307                    152,027                   0.007
10-7                   103,755        38,395                    169,115                   0.001
11-7                   106,039        40,679                    171,399                   0.001


Table 8 shows that only 26 pairwise levels of the “age group” variable are
statistically significant out of a total of 55 pairwise levels. The significant
pairs mainly contrast old people with babies, children, and young adults, such
as 8-1, 8-2, 8-3, 9-1, 9-2, 9-3, 10-1, 10-2, 10-3, 11-1, 11-2, and 11-3. A
possible reason behind these significant pairs is that older people (i.e.,
groups 8, 9, 10, 11) are more likely than younger people (i.e., groups 1, 2, 3,
4) to have underlying health problems, such as dementia, cardiovascular
diseases, and diabetes. Children and young adults are not as prone to severe
forms of COVID-19 as old people. So, COVID deaths are more likely to increase
among older people, which makes the mean differences between these groups
statistically significant, as shown by the Tukey test and the graph of the 95%
family-wise confidence level. Compared to people under 45 years old, the
chances of dying from COVID-19 are much higher among those aged 75 and beyond,
which is consistent with the results obtained in our analysis.


6. Conclusion 


The COVID-19 pandemic has affected all aspects of our daily life. It has
changed how we work, learn, and interact. Since the start of the pandemic,
there have been many suggestions that men contract COVID-19 at a higher rate
than women and are more likely to die from the disease. In order to investigate
the role of gender and age groups in COVID mortality in the US, a two-way ANOVA
was used in this paper to test whether men are more likely to die of COVID-19
than women, and whether age matters for COVID-19 deaths in the US. The response
variable was the number of COVID deaths in the US, and the independent
variables were age group and sex. The two-way ANOVA was run with the
interaction between age groups and sex. The results revealed that the variable
“age groups” was statistically significant, as its p-value was very small and
significant at the 0.001 level. However, the variable “sex” and the interaction
term were found to be insignificant, as their p-values were greater than alpha
(0.05). These results indicate that the age group is the only factor that has a
statistically significant effect on COVID deaths. The sex and interaction term
have no significant effects on COVID deaths. The ANOVA assumptions of
homogeneity of variance, normality, and independence of observations were
checked and found to be met. A post-hoc Tukey HSD test was conducted on the
levels of the significant variable “age groups” to identify the significant
pairwise groups that can affect COVID deaths. The outcome of the Tukey test
showed that there were 26 significant pairwise age groups in the data. Also,
the ANOVA test showed that gender was an insignificant factor in COVID deaths,
which contradicts the widespread notion that men are more susceptible to COVID
deaths than women. Although men and women differ in their genetic makeup,
immune responses, and hormones, both can suffer from the consequences of
COVID-19. There might be other social factors contributing to the sex
differences in COVID deaths, such as job types, behavioral patterns, and
underlying health issues. Another reason might be that women tend to have
stronger immune systems than men. Men also tend to engage in more risky
behaviors and can ignore physical distancing, and they might not take COVID
symptoms as seriously. Behaviors that impact lung health, such as smoking, may
also play a role in the disease’s deadly impact on men.